Circular effects in representations of an RNA nucleotides data set in relation with principal components analysis

Authors

  • T. H. Reijmers
  • R. Wehrens
Abstract

During the last few years, the main reason for using molecular structure databases has changed. Instead of serving only as a storage medium, databases are now also used as a source for data-mining applications. Because of the large number of objects and variables in these databases, multivariate techniques are now applied alongside univariate techniques to search for knowledge hidden in the data. A popular multivariate technique used to explore the underlying structure in data is principal component analysis (PCA). Because structure data are often represented as torsion angles and PCA was not originally designed to deal with this kind of circular data, the outcome of PCA experiments can be misleading. This article describes several alternative representations of circular data and their effect on the outcome of PCA experiments. A worked example is given using a database of RNA nucleotides. © 2001 Elsevier Science B.V. All rights reserved.
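The circular effect mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which alternative representations the article evaluates, so the (cos, sin) encoding below is only one commonly used option, chosen here as an assumption. The data are synthetic, and the helper names `encode_angles` and `pca_scores` are hypothetical, not taken from the article.

```python
import numpy as np


def encode_angles(deg):
    """Map each torsion angle (degrees) to a (cos, sin) pair on the unit circle.

    Input shape (n_samples, n_angles); output shape (n_samples, 2 * n_angles).
    This encoding is an assumed example, not necessarily the article's choice.
    """
    rad = np.deg2rad(np.asarray(deg, dtype=float))
    return np.concatenate([np.cos(rad), np.sin(rad)], axis=1)


def pca_scores(X, n_components=2):
    """Plain PCA via SVD of the mean-centred data matrix."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Xc @ Vt[:n_components].T, explained[:n_components]


# Synthetic data (assumption): two torsion angles for 200 objects.
rng = np.random.default_rng(0)
# Angle 1 forms ONE tight cluster around 180 degrees, but storing it in
# [-180, 180) splits the cluster into values near +180 and near -180.
angle1 = (180.0 + rng.normal(0.0, 5.0, size=(200, 1)) + 180.0) % 360.0 - 180.0
# Angle 2 forms a tight cluster around 60 degrees, far from the wrap point.
angle2 = rng.normal(60.0, 5.0, size=(200, 1))
angles = np.hstack([angle1, angle2])

# PCA on raw angles: the artificial +/-180 split dominates the variance.
raw_scores, raw_explained = pca_scores(angles)

# PCA on the circular encoding: the cluster around 180 degrees stays compact.
encoded = encode_angles(angles)
enc_scores, enc_explained = pca_scores(encoded)

raw_std = angles[:, 0].std()           # inflated by the wrap-around
enc_std_max = encoded.std(axis=0).max()  # small for every encoded column
```

In the raw representation, the first principal component mostly separates values near +180 from values near -180, a distinction that has no structural meaning. After encoding, neighbouring angles on the circle remain neighbours in the data matrix, at the cost of doubling the number of variables.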

Similar resources

Patterns Prediction of Chemotherapy Sensitivity in Cancer Cell lines Using FTIR Spectrum, Neural Network and Principal Components Analysis

    Drug resistance enables cancer cells to break away from cytotoxic effect of anticancer drugs. Identification of resistant phenotype is very important because it can lead to effective treatment plan. There is an interest in developing classifying models of resistance phenotype based on the multivariate data. We have investigated a vibrational spectroscopic approach in order to characterize a...

Persian Handwriting Analysis Using Functional Principal Components

Principal components analysis is a well-known statistical method in dealing with large dependent data sets. It is also used in functional data for both purposes of data reduction as well as variation representation. On the other hand "handwriting" is one of the objects, studied in various statistical fields like pattern recognition and shape analysis. Considering time as the argument,...

Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation

1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential of sediment production. Different estimates of sediment amounts, along with the lack of long-term measurements, limit the accessibility to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...

Circular RNA: features, functions and their correlation with diseases especially cancer

In early 2012, the world of science saw a fascinating discovery called circular RNA as a transcription product of thousands of genes in mice and humans. These circular RNAs have recently been grouped as the encoding RNA in an independent group that their remarkable difference with other RNAs is that these RNAs are not linear, in which two ends connect with a covalent connection creating a loop-...

Publication date: 2000